A simple DOP model for constituency parsing of Italian sentences

Author

  • Federico Sangati
Abstract

We present a simplified Data-Oriented Parsing (DOP) formalism for learning the constituency structure of Italian sentences. In our approach we try to simplify the original DOP methodology by constraining the number and type of fragments we extract from the training corpus. We provide some examples of the types of constructions that occur more often in the treebank, and quantify the performance of our grammar on the constituency parsing task.
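The abstract does not say which constraints on fragments are used, so the following is only an illustrative sketch of the general idea, not the paper's method. It enumerates DOP-style fragments (connected subtrees in which each nonterminal child is either expanded or left as a substitution site) and applies two hypothetical constraints: a depth cap (`max_depth`) and a minimum frequency (`min_count`). The tuple-based tree encoding, the function names, and the two-sentence Italian toy treebank are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# A tree is a tuple (label, child, ...); a child is either another tuple or a word (str).
# A 1-tuple such as ("NP",) denotes a frontier nonterminal (substitution site).

def fragments_at(node, max_depth):
    """Yield every fragment rooted at `node` whose depth is at most `max_depth`."""
    label, *children = node
    options = []
    for child in children:
        if isinstance(child, str):
            options.append([child])            # words are always kept
        else:
            choices = [(child[0],)]            # cut here: leave a substitution site
            if max_depth > 1:                  # or expand the child further
                choices.extend(fragments_at(child, max_depth - 1))
            options.append(choices)
    for combo in product(*options):
        yield (label, *combo)

def subtrees(node):
    """Yield `node` and every nonterminal node below it."""
    yield node
    for child in node[1:]:
        if not isinstance(child, str):
            yield from subtrees(child)

def extract_fragments(treebank, max_depth=2, min_count=2):
    """Count fragments over the treebank and keep those seen at least `min_count` times."""
    counts = Counter()
    for tree in treebank:
        for node in subtrees(tree):
            counts.update(fragments_at(node, max_depth))
    return {frag: n for frag, n in counts.items() if n >= min_count}

# Hypothetical toy treebank: "il cane dorme", "il gatto dorme".
treebank = [
    ("S", ("NP", ("D", "il"), ("N", "cane")), ("VP", ("V", "dorme"))),
    ("S", ("NP", ("D", "il"), ("N", "gatto")), ("VP", ("V", "dorme"))),
]
kept = extract_fragments(treebank, max_depth=2, min_count=2)
for frag, n in sorted(kept.items()):
    print(n, frag)
```

With these settings, fragments that mention `cane` or `gatto` are dropped because each occurs only once, while shared fragments such as `("NP", ("D", "il"), ("N",))` are retained. Raising `max_depth` quickly increases the number of fragments, which is the kind of growth that motivates restricting the fragment set in the first place.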

Similar resources

A New DOP Model for Phrase-structure Parsing of Persian Sentences

In this paper we apply a recent approach to Data-Oriented Parsing (DOP), named Double-DOP, to Persian sentences. Like other DOP models, the Double-DOP parser uses syntactic fragments of arbitrary size from a treebank to analyse new sentences, but it extracts a restricted yet representative subset of fragments: only those that are encountered at least twice. The accurac...


Accurate Parsing with Compact Tree-Substitution Grammars: Double-DOP

We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utilizes syntactic fragments of arbitrary size from a treebank to analyze new sentences, but, crucially, it uses only those which are encountered at least twice. This criterion allows us to work with a relatively small but representative set of fragments, which can be employed as the symbolic backbone ...


Evalita’09 Parsing Task: constituency parsers and the Penn format for Italian

The Evalita Parsing Task aims to define and extend the state of the art for parsing Italian by encouraging the application of existing models and approaches. As in the first edition, the Task therefore includes two tracks: dependency and constituency. The second track is based on a development set in a format that adapts the Penn Treebank format for Italian, and ...


An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Dependency parsers generally perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...


Tree Kernels-based Discriminative Reranker for Italian Constituency Parsers

This paper aims to close the gap between the accuracy of Italian and English constituency parsing. First, we adapt the BLLIP parser (also known as the Charniak parser), i.e., the most accurate constituency parser for English, to Italian and train it on the Turin University Treebank (TUT). Second, we design a parse reranker based on Support Vector Machines using tree kernels, where ...




Publication date: 2009